Consensus Based Ensembles of Soft Clusterings
Authors
Abstract
Cluster Ensembles is a framework for combining multiple partitionings obtained from separate clustering runs into a final consensus clustering. This framework has attracted much interest recently because of its numerous practical applications, and a variety of approaches including Graph Partitioning, Maximum Likelihood, Genetic algorithms, and Voting-Merging have been proposed. The vast majority of these approaches accept hard clusterings as input. There are, however, many clustering algorithms such as EM and fuzzy c-means that naturally output soft partitionings of data, and forcibly hardening these partitions before obtaining a consensus potentially involves loss of valuable information. In this paper we propose several consensus algorithms that work on soft clusterings and experiment with many real-life datasets to empirically show that using soft clusterings as input does offer significant advantages, especially when dealing with vertically partitioned data.
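The information lost by hardening can be seen in a small numerical sketch. The code below is not one of the consensus algorithms proposed in the paper; it is a minimal illustration, assuming each clustering is given as a matrix of membership probabilities (one row per object, one column per cluster). The function names `harden` and `co_association` and the two example matrices are hypothetical. It compares the object-by-object similarity obtained from soft memberships with the one obtained after each row is forced to its most likely cluster.

```python
import numpy as np


def harden(soft):
    """Turn each row of a soft membership matrix into a one-hot row at its argmax."""
    hard = np.zeros_like(soft)
    hard[np.arange(soft.shape[0]), soft.argmax(axis=1)] = 1.0
    return hard


def co_association(memberships):
    """Average over ensemble members of the membership inner products between objects."""
    n = memberships[0].shape[0]
    sim = np.zeros((n, n))
    for m in memberships:
        sim += m @ m.T
    return sim / len(memberships)


# Two hypothetical soft clusterings of four objects (each row sums to 1).
# Object 2 is a borderline case in both clusterings.
soft_a = np.array([[0.90, 0.10], [0.80, 0.20], [0.55, 0.45], [0.10, 0.90]])
soft_b = np.array([[0.95, 0.05], [0.70, 0.30], [0.45, 0.55], [0.20, 0.80]])

soft_sim = co_association([soft_a, soft_b])
hard_sim = co_association([harden(soft_a), harden(soft_b)])

np.set_printoptions(precision=3, suppress=True)
print("similarity from soft memberships:\n", soft_sim)
print("similarity after hardening:\n", hard_sim)
```

With only two ensemble members, every entry of the hardened similarity matrix is 0, 0.5, or 1, whereas the soft version keeps graded values that reflect how confident each clustering was about each object. Either matrix could then be passed to a similarity-based clustering step to obtain a consensus partition.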
Related resources
Cluster Ensembles for High Dimensional Clustering: An Empirical Study
This paper studies cluster ensembles for high dimensional data clustering. We examine three different approaches to constructing cluster ensembles. To address high dimensionality, we focus on ensemble construction methods that build on two popular dimension reduction techniques, random projection and principal component analysis (PCA). We present evidence showing that ensembles generated by ran...
Combinación de clusterizadores difusos mediante voto posicional para clustering robusto de documentos
The combination of multiple clustering processes provides a means for building robust document clustering systems. This work focuses on the consolidation of fuzzy clusterings, proposing two consensus functions for soft cluster ensembles based on the Borda and Condorcet positional voting strategies. Experiments conducted on two document corpora reveal that the proposed soft consensus functions a...
Selecting Ensemble Members in Ensemble Clustering Using Voting
Clustering is the process of dividing a dataset into subsets, called clusters, so that objects within a cluster are similar to each other and different from objects in other clusters. Many algorithms based on different approaches have been developed for clustering. An effective option is to combine two or more of these algorithms to solve the clustering problem. Ensemb...
A CLUE for CLUster Ensembles
Cluster ensembles are collections of individual solutions to a given clustering problem which are useful or necessary to consider in a wide range of applications. The R package clue provides an extensible computational environment for creating and analyzing cluster ensembles, with basic data structures for representing partitions and hierarchies, and facilities for computing on these, including...
Spatially-Aware Comparison and Consensus for Clusterings
This paper proposes a new distance metric between clusterings that incorporates information about the spatial distribution of points and clusters. Our approach builds on the idea of a Hilbert space-based representation of clusters as a combination of the representations of their constituent points. We use this representation and the underlying metric to design a spatially-aware consensus cluste...
Journal title:
- Applied Artificial Intelligence
Volume 22, Issue
Pages -
Publication date 2007